Practical operations of coordinated fleets of mobile robots in different environments reveal benefits of maintaining small distances between robots as they move at higher speeds. This is counter-intuitive in that as speed increases, increased distances would give robots a larger time to respond to sudden motion variations in surrounding robots. However, there is a desire to have lower inter-robot distances in examples like autonomous trucks on highways to optimize energy by vehicle drafting or smaller robots in cluttered environments to maintain communication, etc. This work introduces a model based control framework that directly takes non-linear system dynamics into account. Each robot is able to follow closer at high speeds because it makes predictions on the state information from its adjacent robots and biases it's response by anticipating adjacent robots' motion. In contrast to existing controllers, our non-linear model based predictive decentralized controller is able to achieve lower inter-robot distances at higher speeds. We demonstrate the success of our approach through simulated and hardware results on mobile ground robots.
translated by 谷歌翻译
从废物电气和电子设备(WEEE)中有效拆卸和回收材料是将全球供应链从碳密集型,采矿材料转移到可回收和可再生的材料的关键步骤。常规的回收过程依赖于切碎和分类废物流,但是对于由许多不同材料组成的Weee,我们探索了针对许多物体的靶向拆卸,以改善材料恢复。许多WEEE对象都共享许多关键特征,因此看起来非常相似,但是它们的材料组成和内部组件布局可能会有所不同,因此,对于随后的拆卸步骤,为准确的材料分离和恢复而具有准确的分类器至关重要。这项工作介绍了RGB-X(一种多模式图像分类方法),该方法利用了来自外部RGB图像的关键特征,并从X射线图像中生成的图像来准确地对电子对象进行分类。更具体地说,这项工作开发了迭代类激活映射(ICAM),这是一种新型的网络体系结构,明确地侧重于用于准确的电子对象分类所需的多模式特征映射中的细节。为了培训分类器,由于费用和需要专家指导,电子对象缺乏大型且注释良好的X射线数据集。为了克服这个问题,我们提出了一种新的方法,可以使用应用于X射线域的域随机化创建合成数据集。合并的RGB-X方法使我们在10代现代智能手机上的准确度为98.6%,其单独的精度为89.1%(RGB)和97.9%(X射线)。我们提供实验结果3来证实我们的结果。
translated by 谷歌翻译
模块化机器人可以在每天重新排列到新设计中,通过为每项新任务形成定制机器人来处理各种各样的任务。但是,重新配置的机制是不够的:每个设计还需要自己独特的控制策略。人们可以从头开始为每个新设计制作一个政策,但这种方法不可扩展,特别是给出了甚至一小组模块可以生成的大量设计。相反,我们创建了一个模块化策略框架,策略结构在硬件排列上有调节,并仅使用一个培训过程来创建控制各种设计的策略。我们的方法利用了模块化机器人的运动学可以表示为设计图,其中节点作为模块和边缘作为它们之间的连接。给定机器人,它的设计图用于创建具有相同结构的策略图,其中每个节点包含一个深神经网络,以及通过共享参数的相同类型共享知识的模块(例如,Hexapod上的所有腿都相同网络参数)。我们开发了一种基于模型的强化学习算法,交织模型学习和轨迹优化,以培训策略。我们展示了模块化政策推广到培训期间没有看到的大量设计,没有任何额外的学习。最后,我们展示了与模拟和真实机器人一起控制各种设计的政策。
translated by 谷歌翻译
Extracting complex structures from grid-based data is a common key step in automated medical image analysis. The conventional solution to recovering tree-structured geometries typically involves computing the minimal cost path through intermediate representations derived from segmentation masks. However, this methodology has significant limitations in the context of projective imaging of tree-structured 3D anatomical data such as coronary arteries, since there are often overlapping branches in the 2D projection. In this work, we propose a novel approach to predicting tree connectivity structure which reformulates the task as an optimization problem over individual steps of a recursive process. We design and train a two-stage model which leverages the UNet and Transformer architectures and introduces an image-based prompting technique. Our proposed method achieves compelling results on a pair of synthetic datasets, and outperforms a shortest-path baseline.
translated by 谷歌翻译
There are multiple scales of abstraction from which we can describe the same image, depending on whether we are focusing on fine-grained details or a more global attribute of the image. In brain mapping, learning to automatically parse images to build representations of both small-scale features (e.g., the presence of cells or blood vessels) and global properties of an image (e.g., which brain region the image comes from) is a crucial and open challenge. However, most existing datasets and benchmarks for neuroanatomy consider only a single downstream task at a time. To bridge this gap, we introduce a new dataset, annotations, and multiple downstream tasks that provide diverse ways to readout information about brain structure and architecture from the same image. Our multi-task neuroimaging benchmark (MTNeuro) is built on volumetric, micrometer-resolution X-ray microtomography images spanning a large thalamocortical section of mouse brain, encompassing multiple cortical and subcortical regions. We generated a number of different prediction challenges and evaluated several supervised and self-supervised models for brain-region prediction and pixel-level semantic segmentation of microstructures. Our experiments not only highlight the rich heterogeneity of this dataset, but also provide insights into how self-supervised approaches can be used to learn representations that capture multiple attributes of a single image and perform well on a variety of downstream tasks. Datasets, code, and pre-trained baseline models are provided at: https://mtneuro.github.io/ .
translated by 谷歌翻译
Cohn and Umans proposed a framework for developing fast matrix multiplication algorithms based on the embedding computation in certain groups algebras. In subsequent work with Kleinberg and Szegedy, they connected this to the search for combinatorial objects called strong uniquely solvable puzzles (strong USPs). We begin a systematic computer-aided search for these objects. We develop and implement constraint-based algorithms build on reductions to $\mathrm{SAT}$ and $\mathrm{IP}$ to verify that puzzles are strong USPs, and to search for large strong USPs. We produce tight bounds on the maximum size of a strong USP for width $k \le 5$, construct puzzles of small width that are larger than previous work, and improve the upper bounds on strong USP size for $k \le 12$. Although our work only deals with puzzles of small-constant width, the strong USPs we find imply matrix multiplication algorithms that run in $O(n^\omega)$ time with exponent $\omega \le 2.66$. While our algorithms do not beat the fastest algorithms, our work provides evidence and, perhaps, a path to finding families of strong USPs that imply matrix multiplication algorithms that are more efficient than those currently known.
translated by 谷歌翻译
Agile robotics presents a difficult challenge with robots moving at high speeds requiring precise and low-latency sensing and control. Creating agile motion that accomplishes the task at hand while being safe to execute is a key requirement for agile robots to gain human trust. This requires designing new approaches that are flexible and maintain knowledge over world constraints. In this paper, we consider the problem of building a flexible and adaptive controller for a challenging agile mobile manipulation task of hitting ground strokes on a wheelchair tennis robot. We propose and evaluate an extension to work done on learning striking behaviors using a probabilistic movement primitive (ProMP) framework by (1) demonstrating the safe execution of learned primitives on an agile mobile manipulator setup, and (2) proposing an online primitive refinement procedure that utilizes evaluative feedback from humans on the executed trajectories.
translated by 谷歌翻译
Curating datasets for object segmentation is a difficult task. With the advent of large-scale pre-trained generative models, conditional image generation has been given a significant boost in result quality and ease of use. In this paper, we present a novel method that enables the generation of general foreground-background segmentation models from simple textual descriptions, without requiring segmentation labels. We leverage and explore pre-trained latent diffusion models, to automatically generate weak segmentation masks for concepts and objects. The masks are then used to fine-tune the diffusion model on an inpainting task, which enables fine-grained removal of the object, while at the same time providing a synthetic foreground and background dataset. We demonstrate that using this method beats previous methods in both discriminative and generative performance and closes the gap with fully supervised training while requiring no pixel-wise object labels. We show results on the task of segmenting four different objects (humans, dogs, cars, birds).
translated by 谷歌翻译
Artificial Intelligence (AI) has become commonplace to solve routine everyday tasks. Because of the exponential growth in medical imaging data volume and complexity, the workload on radiologists is steadily increasing. We project that the gap between the number of imaging exams and the number of expert radiologist readers required to cover this increase will continue to expand, consequently introducing a demand for AI-based tools that improve the efficiency with which radiologists can comfortably interpret these exams. AI has been shown to improve efficiency in medical-image generation, processing, and interpretation, and a variety of such AI models have been developed across research labs worldwide. However, very few of these, if any, find their way into routine clinical use, a discrepancy that reflects the divide between AI research and successful AI translation. To address the barrier to clinical deployment, we have formed MONAI Consortium, an open-source community which is building standards for AI deployment in healthcare institutions, and developing tools and infrastructure to facilitate their implementation. This report represents several years of weekly discussions and hands-on problem solving experience by groups of industry experts and clinicians in the MONAI Consortium. We identify barriers between AI-model development in research labs and subsequent clinical deployment and propose solutions. Our report provides guidance on processes which take an imaging AI model from development to clinical implementation in a healthcare institution. We discuss various AI integration points in a clinical Radiology workflow. We also present a taxonomy of Radiology AI use-cases. Through this report, we intend to educate the stakeholders in healthcare and AI (AI researchers, radiologists, imaging informaticists, and regulators) about cross-disciplinary challenges and possible solutions.
translated by 谷歌翻译
Automated cellular instance segmentation is a process utilized for accelerating biological research for the past two decades, and recent advancements have produced higher quality results with less effort from the biologist. Most current endeavors focus on completely cutting the researcher out of the picture by generating highly generalized models. However, these models invariably fail when faced with novel data, distributed differently than the ones used for training. Rather than approaching the problem with methods that presume the availability of large amounts of target data and computing power for retraining, in this work we address the even greater challenge of designing an approach that requires minimal amounts of new annotated data as well as training time. We do so by designing specialized contrastive losses that leverage the few annotated samples very efficiently. A large set of results show that 3 to 5 annotations lead to models with accuracy that: 1) significantly mitigate the covariate shift effects; 2) matches or surpasses other adaptation methods; 3) even approaches methods that have been fully retrained on the target distribution. The adaptation training is only a few minutes, paving a path towards a balance between model performance, computing requirements and expert-level annotation needs.
translated by 谷歌翻译